Abstract

We performed a network traffic simulation to clarify the 
mechanism producing self-similar traffic originating in 
the transport layer level. Self-similar behavior could be 
observed without assuming a long-tailed distribution of 
the input file size. By repeating simulations with modi- 
fied TCP we found that the feedback mechanism from the 
network, such as packet transmission driven by acknowl- 
edgement packets, plays an essential role in explaining 
the self-similarity observed in the actual traffic. 

1 INTRODUCTION 

Internet traffic fluctuation is known to show self-similarity
or long-range dependency. This self-similarity is the
scale-invariant property that the burst size of the flow
density fluctuation shows the same tendency at various
observation time scales. Recently, it was pointed out that
network traffic behavior can be regarded as a phase transition
phenomenon in statistical physics, which naturally involves
the self-similar model. The phase transition is characterized
by dynamical phase changes between non-congested
and congested phases, and self-similarity can be observed 
at the critical point between these two phases. 

In addition to such observations, some studies have
investigated the mechanism behind the self-similarity observed
in network traffic. At the application layer, Crovella et al.
explained self-similar traffic from the viewpoint that the
sizes of files on web servers follow a power-law distribution.
For the datalink layer, Fukuda et al. showed that self-similar
Ethernet traffic can be reproduced by the effects of contention
between nodes and of the exponential backoff mechanism at
packet collisions. Also, Park et al. and Feldmann et al.
pointed out that the transport layer functionality (especially
TCP) strengthens the long-range dependency. Although they
aimed to show the transport layer effect, their approaches
implicitly assume long-range dependency at the application
level, such as in the file size distribution of the
application. Thus, there seems to be little understanding of
the physical explanation of the transport layer functionality
itself from the viewpoint of the generation of self-similar
traffic.

In this paper, we focus on the transport layer effect
independent of the application level causality. In other
words, we investigate the essential factors for generating
self-similar traffic in TCP. To clarify them, we performed a
simulation with a simple topology using the ns-2 simulator.
The results show that phase transition phenomena (and
self-similarity) quite similar to those observed in actual
traffic can occur in aggregated TCP traffic even when the
input traffic sources have no temporal correlation or
long-tailed distribution in file size. We also demonstrate
that the feedback mechanism (especially
acknowledgement-packet-driven events) plays an important role
in generating the self-similarity observed in actual traffic,
whereas the rate control and retransmission mechanisms have
less impact on it.



2 SIMULATION SETUP AND
ANALYSIS METHOD

2.1 Simulation Setup

In this simulation, we used the VINT network simulator ns-2
with some additional Tcl scripts and C++ code. The scenario
was file transfer from a sender to a receiver on the leaf
nodes. Figure 1 shows our simple simulation topology,
consisting of three leaf nodes and one router.

Figure 1: Network topology.

Each connection employed TCP Reno as the basic transport
protocol, and a TCP connection was established between a pair
of randomly selected leaf nodes. The connection inter-arrival
times were exponentially distributed. It is important to note
that the number of packets in a connection followed an
exponential distribution (mean size = 100 packets) throughout
this simulation. Namely, the number of packets per connection
had no temporal correlation, which means that there was no
application-level causality. This condition is needed to
isolate the transport mechanism from the viewpoint of
self-similarity. The buffer sizes in the leaf nodes and the
router were set to 800 packets in most simulations, and the
packet size was set to 576 bytes. Also, the bandwidth and
transmission delay of each half-duplex link were 500 kbyte/sec
and 50 msec, respectively. The large buffer size was chosen
for easy analysis of the fluctuation of packets in the buffer.
Our results were obtained from several runs of the simulation,
each lasting 4800 seconds. We were interested in the
statistical behavior of the aggregated traffic streams when
the connection arrival rate (denoted by "r") to the leaf nodes
varied.

2.2 Analysis Method

In order to examine the self-similar nature of network 
traffic fluctuation, we focused on two empirical distribu- 
tions, namely the congestion duration and the recurrent 
time of the queue length. 

The congestion duration length distribution is a well-known
distribution for judging the self-similarity of a given time
series. The congestion state is defined by the condition that
the output flow density in the observed link is larger than a
certain threshold flow density. Then, the congestion duration
length (L) is calculated as the number of consecutive
congestion states multiplied by the bin size. The cumulative
distribution of this duration, P(> L), is a power-law
distribution with exponent -1.0, $P(> L) \propto L^{-1.0}$,
when the original flow fluctuation is self-similar, which is
characterized by the 1/f type power spectrum. Theoretically,
the power-law distribution is observed independent of the
value of the threshold
if the original fluctuation is self-similar. 
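
The definition above can be sketched in a few lines of Python. This is only an illustration, not the analysis code used for the simulations: the function names, the strict "larger than" comparison, and the sample values are assumptions made for the example.

```python
from itertools import groupby

def congestion_durations(flow, threshold, bin_size):
    """Return the duration L of every run of consecutive congested bins.

    flow      -- output flow density per time bin (e.g. bytes per bin)
    threshold -- a bin counts as congested when its flow exceeds this value
    bin_size  -- length of one bin, in the time unit wanted for L
    """
    durations = []
    for congested, run in groupby(flow, key=lambda x: x > threshold):
        if congested:
            durations.append(sum(1 for _ in run) * bin_size)
    return durations

def ccdf(values):
    """Empirical cumulative distribution P(> L) as sorted (L, P) pairs."""
    n = len(values)
    return [(level, sum(1 for v in values if v > level) / n)
            for level in sorted(set(values))]

# Hypothetical flow samples, 100 msec bins, threshold of 5000 bytes.
flow = [0, 6000, 7000, 100, 9000, 9000, 9000, 0, 5500]
durations_ms = congestion_durations(flow, threshold=5000, bin_size=100)
```

Plotting the pairs returned by `ccdf(durations_ms)` on log-log axes gives the kind of curve discussed in the following sections.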

The recurrent time of the queue length is defined as the
duration until the queue length next returns to zero.
The cumulative distribution of this recurrent time obeys
the same power-law distribution with exponent -1.0 as
the congestion duration length. 
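
A minimal sketch of this measurement follows. It assumes the queue length is sampled at a fixed interval and that each recurrent time is counted from the sample at which the queue becomes non-empty; both are assumptions of the example, and `recurrent_times` is a hypothetical name.

```python
def recurrent_times(queue_len, dt):
    """Durations from the queue becoming non-empty until it returns to zero.

    queue_len -- queue length sampled every dt time units
    An excursion still open at the end of the trace is discarded.
    """
    times = []
    start = None
    for i, q in enumerate(queue_len):
        if q > 0 and start is None:
            start = i                      # queue just left zero
        elif q == 0 and start is not None:
            times.append((i - start) * dt)  # queue came back to zero
            start = None
    return times

# Hypothetical samples: two completed excursions, one unfinished.
samples = [0, 1, 2, 0, 0, 3, 0, 4]
print(recurrent_times(samples, dt=1))
```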

We checked the congestion duration length of the link 
from the router to leaf node 2 in Figure 1 (denoted by 
A). Also, we observed the queue length at the router's
output queue to leaf node 3.

3 TRANSPORT LAYER 
EFFECT 

3.1 Real Traffic Flow 

In this subsection, we review the traffic fluctuation in 
an actual network focusing on the self-similarity and the 
phase transition phenomena. 

Figure 2 shows the cumulative distribution of the congestion
duration length in an actual traffic flow (see footnote 1).
This traffic trace was collected over 4 hours on the Ethernet
link connecting the WIDE backbone in Japan and the US west
coast; 80% of the traffic was due to web applications. The
three curves in the figure correspond to different mean flow
densities of the traffic. The figure shows that for medium
flow the distribution is approximately a power law with
exponent -1.0, representing self-similarity. However, away
from the critical point the distributions deviate from the
power law. When the total amount of traffic is small, the
congestion duration length obeys an exponential distribution
characterized by short-time dependency. On the other hand,
the larger traffic density case, denoted by the high flow in
this figure, demonstrates the existence of large-cluster
congestion. Consequently, this result clearly shows that
self-similarity occurs in a special case in actual network
traffic, and the phase transition view can capture all of
these properties.

1 More detailed analysis is shown in [4].

Figure 2: Congestion duration length in actual network
traffic. The straight line indicates the slope -1.0.

3.2 TCP Traffic Behavior 

We are interested in the mechanism of generating the self- 
similar traffic observed in the previous subsection from 
the standpoint of the transport layer functionality. This 
subsection explains a numerical simulation with the ordinary
TCP Reno algorithm.

Figure 3: Congestion duration length of TCP Reno. The
straight line indicates the slope -1.0.

Figure 3 shows the cumulative distribution of the congestion
duration length of the aggregated TCP traffic flows at link A
in Figure 1. The threshold value of the congestion level was
empirically set to 5000 bytes throughout this simulation. The
three lines correspond to three different connection arrival
rates (r = 0.5, 2.5, 4.0 connections/sec). The mean connection
duration times were 1.53, 13.4, and 412.64 sec, respectively.
We found that the slope of the distribution at r = 2.5 was
approximately -1.0 (the slope of the straight line in the
figure), which is recognized as the critical point behavior.
For the medium connection arrival rate, the traffic flow was
self-similar like the actual traffic. Also, the plot decays
exponentially below the critical point (r = 0.5), and it is
characterized by a stretched curve above the critical point
(r = 4.0), representing the existence of coarse-grained
congestion. These curves are completely consistent with the
actual traffic behavior. The most significant point in this
simulation is that this power-law distribution with exponent
-1.0 could be observed even when the input traffic followed an
exponential distribution, without assuming a fat-tailed
distribution. Namely, the self-similarity appeared in
the traffic fluctuation independent of the input file size 
distribution at the critical point. 
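
One simple way to read a slope such as -1.0 off a cumulative distribution is a least-squares fit in log-log coordinates. The sketch below shows that generic technique only; the text does not state how the slopes were obtained, and the function name and synthetic data are assumptions of the example.

```python
import math

def loglog_slope(points):
    """Least-squares slope of log10(P) versus log10(L).

    points -- iterable of (L, P) pairs; pairs with L <= 0 or P <= 0
              are skipped because their logarithm is undefined.
    """
    logs = [(math.log10(l), math.log10(p)) for l, p in points if l > 0 and p > 0]
    n = len(logs)
    mean_x = sum(x for x, _ in logs) / n
    mean_y = sum(y for _, y in logs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in logs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in logs)
    return sxy / sxx

# Synthetic check: an exact power law P(> L) = 1 / L has slope -1.
synthetic = [(L, 1.0 / L) for L in (1, 2, 5, 10, 20, 50, 100)]
slope = loglog_slope(synthetic)
```

For measured data the fit should be restricted to the range of L over which the curve looks straight, since the tail of an empirical distribution is noisy.
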

Figure 4: Recurrent time of queue length. The straight
line indicates the slope -1.0.

Next, we show the distribution of the recurrent time 
of the queue length at the router's queue (Figure 4). 
Again, the plotted curve followed the power-law distribution
with exponent -1.0 at the critical point (r = 2.5), which is
the same connection arrival rate as in the congestion duration
length analysis. Also, when the connection arrival rate was
smaller, the plotted curve exhibited a quick decay, and a
larger rate led to a more stretched distribution due to
large-scale congestion.

Additionally, we confirmed that the congestion dura- 
tion length distribution and the recurrent time distribu- 
tion had the same phase transition behavior in all leaf 
nodes (links). 

3.3 Effect of TCP Component 

Section 3.2 clarified that transport functionality (TCP) 
itself plays a role in producing self-similar traffic. Now we 
focus on the generation mechanism of the self-similarity 
in the aggregated traffic fluctuation. This subsection 
explains the results of additional simulations based on 
modified TCP in order to clarify the mechanism of the 
power-law distribution with exponent -1.0 observed in
the previous subsection. 

The first modification is not to use the slow start al- 
gorithm which increases the transmission rate exponen- 
tially. The modified algorithm employs a linear rate in- 
crement even in the connection starting time, instead of 
the original exponential rate increment. 
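
As a rough illustration of how different the two start-up rules are, the sketch below counts the round trips needed to send a connection of mean size (100 packets). It is a deliberately idealized model, not the ns-2 code: it assumes a starting window of one packet, a window that doubles (slow start) or grows by one packet (linear start) per round trip, and no loss, threshold, or receiver limit. All names are hypothetical.

```python
def rounds_to_send(n_packets, next_window):
    """Count round trips needed to send n_packets, starting from a window of 1."""
    window, sent, rounds = 1, 0, 0
    while sent < n_packets:
        sent += window            # one window of packets per round trip
        rounds += 1
        window = next_window(window)
    return rounds

def slow_start(window):
    """Exponential growth: the window doubles every round trip."""
    return window * 2

def linear_start(window):
    """Linear growth: the window grows by one packet every round trip."""
    return window + 1

print(rounds_to_send(100, slow_start))    # 7
print(rounds_to_send(100, linear_start))  # 14
```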

Figure 5 shows the congestion duration length distribution
of the linear rate increment case with feedback control. The
aggregated traffic also exhibited a phase transition similar
to that in the normal TCP simulation. Thus, the phase
transition and the corresponding self-similarity are found to
be independent of the details of the increment algorithm of
the transmission rate.

Figure 5: Congestion duration length of linear start TCP.
The straight line indicates the slope -1.0.



Next, we modified the feedback control algorithm. We
checked two non-feedback control schemes: CBR over UDP, and a
linear rate increment algorithm over UDP. In the CBR-over-UDP
simulation, the packet inter-arrival time was set to 20 msec.
In the linear rate increment algorithm over UDP, the
transmission rate from the sender increases by 1 at every
fixed interval (150 msec). The latter method is similar to the
previous linear increment algorithm modified from the original
TCP. The difference is in the trigger of the packet trans- 
mission event; namely, the packet transmission event is 
based on the fixed-time interval event independent of the 
reception of the acknowledgement packets. The distribu- 
tions of the number of packets to be sent and connection 
arrival duration are exponential as in the previous simu- 
lations. 
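
The two timer-driven senders can be sketched as send-time schedules. The CBR case follows directly from the 20 msec spacing; for the linear case the sketch assumes one possible reading of the description, namely that the k-th timer tick (k = 0, 1, ...) releases k + 1 packets. That reading and the function names are assumptions of the example. The point it illustrates is that neither schedule depends on acknowledgement packets.

```python
def cbr_send_times_ms(n_packets, interval_ms=20):
    """Constant bit rate: one packet every interval_ms, regardless of ACKs."""
    return [k * interval_ms for k in range(n_packets)]

def timer_driven_linear_send_times_ms(n_packets, tick_ms=150):
    """Linear rate increment driven only by a timer.

    Assumed reading: tick k (k = 0, 1, ...) releases k + 1 packets,
    so the rate grows by one packet per tick.
    """
    times = []
    tick = 0
    while len(times) < n_packets:
        burst = min(tick + 1, n_packets - len(times))
        times.extend([tick * tick_ms] * burst)
        tick += 1
    return times

print(cbr_send_times_ms(3))
print(timer_driven_linear_send_times_ms(6))
```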

The distribution of the congestion duration length for the
non-feedback control algorithms is shown in Figure 6. The two
non-feedback algorithms reproduce similar statistical
tendencies of the congestion duration length; therefore, we
show only the result of the linear rate increment algorithm
over UDP. We found that the exponent of the power-law
distribution was -0.5 (the slope of the straight line in the
figure), clearly different from the exponent -1.0 obtained in
the previous subsection, although phase transition behavior
qualitatively similar to that of the previous simulations was
observed. This type of exponent is known for the single buffer
system with Poisson input [12]. Thus, from this simulation, we
can conclude that the feedback mechanism is important in
generating a self-similar fluctuation with exponent -1.0,
which is
observed in actual Internet traffic. 
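
The exponent -0.5 for a buffer without feedback can be illustrated with a textbook model, offered here as an analogy rather than as the derivation in the cited work. If the queue length at the critical load is treated as a symmetric random walk with steps of +1 and -1, the probability that the walk has not returned to zero within 2n steps is C(2n, n) / 4^n, which decays as n^(-1/2). The sketch below evaluates this exactly and estimates the tail exponent; the function names are hypothetical.

```python
import math

def no_return_probability(n):
    """P(symmetric +/-1 walk has not returned to 0 within 2n steps).

    Equals C(2n, n) / 4**n, computed as a running product to avoid
    very large intermediate integers.
    """
    p = 1.0
    for k in range(1, n + 1):
        p *= (2 * k - 1) / (2 * k)
    return p

def tail_exponent(n1, n2):
    """Log-log slope of the no-return probability between n1 and n2."""
    ratio = no_return_probability(n2) / no_return_probability(n1)
    return math.log(ratio) / math.log(n2 / n1)

exponent = tail_exponent(100, 10000)   # close to -0.5
```
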

Figure 6: Congestion duration length of non-feedback
algorithm. The straight line indicates the slope -0.5.



3.4 Effect of Buffer Capacity 

Figure 7: Packet loss and critical point vs. buffer capacity.

Figure 7 plots the connection arrival rate at which 
packet loss is first observed in the system as a function of 
the buffer capacity in the nodes together with the critical 
connection arrival rate showing the power-law distribu- 
tion. 

This figure shows that both the critical point and the 
packet loss point become larger as the buffer capacity 
increases. However, it should be emphasized that
self-similarity was observed without any packet loss event for
buffer capacities larger than 400 packets; namely, the
retransmission event is not directly involved in the
generation of the phase transition. Moreover, we confirmed
that there was no timeout event of the retransmission timer in
this range. This is evidence that the exponent of the
power-law distribution is independent of the retransmission
method. Also, a larger buffer capacity leads to a larger
critical point value; however, the buffer capacity itself does
not affect the generation of the phase transition phenomena.

4 CONCLUDING REMARKS

In this paper we focused on a simple mechanism for generating
the self-similarity observed in actual network traffic at the
transport layer. In order to clarify the essence of the
mechanism, we performed simulations with simple settings such
as exponential file size, fixed-size packets, and a simple
topology.

We showed that self-similar traffic is observable in these
simple settings when the traffic sources send packets using
the normal TCP algorithm. In addition, the reproduced traffic
behavior is consistent with the phase transition model. The
most significant result is that the self-similarity appears
even with the exponential input file size. This indicates that
the transport protocol itself includes the mechanism
generating self-similar traffic.

Moreover, from the results for the modified algorithms, we
clarified that the feedback mechanism in TCP, especially the
packet transmission triggered by the acknowledgement packet,
is an essential factor in generating the self-similarity.
Traffic from the non-feedback linear rate increment algorithm
exhibits similar phenomena; however, the exponent of its
power-law distribution, -0.5, is inconsistent with that of TCP
with the linear rate increment algorithm, -1.0. This indicates
that an essential factor is the acknowledgement-driven packet
transmission rather than the timer-driven one. We also
confirmed in both the normal and modified TCP simulations that
the inter-arrival time of the acknowledgement packets in a TCP
connection has a power-law distribution at the critical point
(between 0.01 and 1.0 seconds). Namely, the origin of the
self-similar traffic is likely due to the feedback mechanism
from the network, such as the delay of acknowledgement
packets.

Finally, we showed that the retransmission mechanism and
buffer capacity have less impact on the generation of
self-similarity (1/f type fluctuation). We concluded that
these effects only work to stretch the time spent staying in
the self-similar state. The network state seems to have been
heavily congested at the critical connection arrival rate in
our simulation. However, there are congested routers in actual
wide area networks, and our results indicate that if a flow
passes through a router in the critical state once, the flow
can be self-similar.

Our simulation results have extracted the essence of the
origin of self-similar traffic from an actual network system
that is complex both topologically and algorithmically. The
future direction of this research will be to support the
development of a more effective congestion control algorithm
based on the knowledge obtained from these simulations.